The em Algorithm for Kernel Matrix Completion with Auxiliary Data
نویسندگان
چکیده
In biological data, it is often the case that observed data are available only for a subset of samples. When a kernel matrix is derived from such data, we have to leave the entries for unavailable samples as missing. In this paper, the missing entries are completed by exploiting an auxiliary kernel matrix derived from another information source. The parametric model of kernel matrices is created as a set of spectral variants of the auxiliary kernel matrix, and the missing entries are estimated by fitting this model to the existing entries. For model fitting, we adopt the em algorithm (distinguished from the EM algorithm of Dempster et al., 1977) based on the information geometry of positive definite matrices. We will report promising results on bacteria clustering experiments using two marker sequences: 16S and gyrB.
منابع مشابه
Mutual Kernel Matrix Completion
With the huge influx of various data nowadays, extracting knowledge from them has become an interesting but tedious task among data scientists, particularly when the data come in heterogeneous form and have missing information. Many data completion techniques had been introduced, especially in the advent of kernel methods. However, among the many data completion techniques available in the lite...
متن کاملAn infeasible interior-point method for the $P*$-matrix linear complementarity problem based on a trigonometric kernel function with full-Newton step
An infeasible interior-point algorithm for solving the$P_*$-matrix linear complementarity problem based on a kernelfunction with trigonometric barrier term is analyzed. Each (main)iteration of the algorithm consists of a feasibility step andseveral centrality steps, whose feasibility step is induced by atrigonometric kernel function. The complexity result coincides withthe best result for infea...
متن کاملApproximating Incomplete Kernel Matrices by the em Algorithm
In biological data, it is often the case that observed data are available only for a subset of samples. When a kernel matrix is derived from such data, we have to leave the entries for unavailable samples as missing. In this paper, we make use of a parametric model of kernel matrices, and estimate missing entries by fitting the model to existing entries. The parametric model is created as a set...
متن کاملProtein Classification via Kernel Matrix Completion
The three-dimensional structure of a protein provides crucial information for predicting its function. However, as it is still a far more difficult and costly task to measure 3D coordinates of atoms in a protein than to sequence its amino acid composition, often we do not know the 3D structures of all the proteins at hand. Let us consider a kernel matrix that consists of kernel values represent...
متن کاملAlgebraic Variety Models for High-Rank Matrix Completion
We consider a generalization of low-rank matrix completion to the case where the data belongs to an algebraic variety, i.e., each data point is a solution to a system of polynomial equations. In this case the original matrix is possibly high-rank, but it becomes low-rank after mapping each column to a higher dimensional space of monomial features. Many well-studied extensions of linear models, ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of Machine Learning Research
دوره 4 شماره
صفحات -
تاریخ انتشار 2003